N-word-sequence Frequency SLM Based on Binomia

Authors

  • Yibao Zhao
  • Guojun Zhou
Abstract

It is often difficult to build a robust Statistical Language Model (SLM) for a domain-specific spoken dialogue system because collecting enough data for a specific domain is very challenging. One solution is to build the SLM from domain-specific grammar rules, which does not require collecting a lot of data. A number of studies have found this solution effective and encouraging. However, the statistical information obtained from domain-specific grammar rules cannot correctly represent the distribution of n-word-sequences in real applications, which results in undesirable performance. It is observed that the n-word-sequence frequency-of-frequency distribution obtained from a general-purpose corpus has a smooth curve, while the one obtained from domain grammar rules does not. Based on the assumption that each n-word-sequence in real applications normally follows a binomial distribution, this paper proposes a pair of n-word-sequence frequency smoothing algorithms, called the Coast Algorithm and the Tide Algorithm, which can significantly mitigate the “noise” present in the n-word-sequence frequency-of-frequency obtained directly from domain-specific grammar rules. Our experiments with a domain-specific spoken dialogue system show that an SLM generated from domain-specific grammar rules and smoothed using the Coast and Tide algorithms reduces the TER (Tag Error Rate) by 13.02% (relative). These two algorithms can therefore improve system performance.
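As a minimal sketch of the quantity the abstract refers to, the Python below counts n-word-sequences in a token list and then tabulates their frequency-of-frequency (how many distinct sequences occur exactly r times). It does not implement the Coast or Tide algorithms, which the abstract does not describe; the function names and the toy sentence are assumptions made for illustration only.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-word-sequence (n-gram) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def frequency_of_frequency(counts):
    """Map each frequency r to the number of distinct n-grams seen exactly r times."""
    return Counter(counts.values())


# Toy data (hypothetical): two utterances a grammar rule might generate.
tokens = "show me flights to boston show me flights to denver".split()
bigrams = ngram_counts(tokens, 2)
fof = frequency_of_frequency(bigrams)

for r in sorted(fof):
    print(f"{fof[r]} distinct bigram(s) occur {r} time(s)")
```

Plotting `fof` for a large general-purpose corpus against the same table for grammar-generated text would be one way to look at the smooth versus non-smooth curves the abstract mentions.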


Similar resources

Improvement of a structured language model: arbori-context tree

In this paper we present an extension of a context tree for a structured language model (SLM), which we call an arbori-context tree. The state-of-the-art SLM predicts the next word from a fixed partial tree of the history tree, such as two exposed heads, etc. An arbori-context tree allows us to select an optimum partial tree of a history tree for the next word prediction depending on the effective...


A Structured Language Model Based on Context-Sensitive Probabilistic Left-Corner Parsing

Recent contributions to statistical language modeling for speech recognition have shown that probabilistically parsing a partial word sequence aids the prediction of the next word, leading to “structured” language models that have the potential to outperform n-grams. Existing approaches to structured language modeling construct nodes in the partial parse tree after all of the underlying words h...


M-ary Chaotic Sequence Based SLM-OFDM System for PAPR Reduction without Side-Information

Selected Mapping (SLM) is a PAPR reduction technique that converts the OFDM signal into several independent signals by multiplying it with a set of phase sequences and transmits the signal with the lowest PAPR. However, it requires the index of the selected signal, i.e. side information (SI), to be transmitted with each OFDM symbol. The PAPR reduction capability of the SLM scheme depends on the se...


Reduction Of PAPR Using HADAMARD SLM In SFBC MIMO-OFDM System

This paper applies the Hadamard Transform within SLM to reduce the high peak-to-average power ratio (PAPR) in MIMO-OFDM systems. In this technique, the input sequence is multiplied by each of a set of phase rotation vectors, and the Hadamard Transform is then applied to each resulting sequence based on SFBC. After that, the Inverse Fast Fourier Transform is performed in order to get the t...


Deterministic Selection of Phase Sequences in Low Complexity SLM Scheme

Selected mapping (SLM) is a suitable scheme for solving the peak-to-average power ratio (PAPR) problem. Recently, many researchers have concentrated on reducing the computational complexity of SLM schemes. One of the low-complexity SLM schemes is the Class III SLM scheme, which uses only one inverse fast Fourier transform (IFFT) operation for generating one orthogonal frequency division...



Publication date: 2002